FIELD OF THE INVENTION
The present invention is directed to an improvement in computing systems and in particular to
improved query execution in query processing systems, such as relational data bases. 

In query processing systems, such as relational data base management systems (RDBMS), data
values are extracted from stored images of the data for further processing by the query evaluation
system. Typically, and in RDBMS systems, the data is structured as rows comprised of column
values. The rows are grouped into contiguous storage blocks known as pages. A part of query
evaluation in such systems is the isolation of successive rows and extraction of a subset of the
column values for the row. These values are used in query evaluation steps which may include one
or more of filtering, sorting, grouping, joining, or other relational or data manipulation steps.
Copying the data values from data pages involves the steps of identifying and locating in main
memory the page containing the row of interest, locating the row within the page and locating the
column values within the row. The column values are then copied to new locations in memory
where they are made available for query evaluation. Typically a page may be located in the main
memory of a computer or located on a secondary storage device, typically a computer disk. In query
evaluation systems which support concurrent query executions, the page containing data must be
"stabilized" to ensure that it remains at the same location in memory and to prevent concurrent reads
and updates to the page, to preserve the logical integrity of the page contents while it is being
accessed by a particular process or thread. After copying column data values to a new location, the
page stabilization is ended (released). The steps of locating the page, stabilizing the page, locating
a row in the page, and releasing the stabilization for each row to be processed by the query
evaluation system, may constitute a significant portion of the overall execution cost of a query.
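
By way of illustration only, the per-row sequence described above can be sketched in Python. The `Page` class, its `stabilize` and `release` methods, and the `fetch_row` helper are hypothetical names that do not appear in this specification, and stabilization is reduced to a simple counter so that its repetition is visible.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """A page holding rows; each row is a tuple of column values."""
    rows: list
    stabilize_count: int = 0   # how many times the page has been stabilized
    stabilized: bool = False

    def stabilize(self):
        # Stand-in for fixing the page in memory and blocking conflicting access.
        self.stabilized = True
        self.stabilize_count += 1

    def release(self):
        self.stabilized = False


def fetch_row(page, slot, wanted_cols):
    """Copy the wanted column values of one row, stabilizing the page per call."""
    page.stabilize()
    try:
        row = page.rows[slot]                        # locate the row in the page
        return tuple(row[c] for c in wanted_cols)    # copy out the column values
    finally:
        page.release()                               # stabilization ends per row


page = Page(rows=[(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)])
results = [fetch_row(page, slot, wanted_cols=(0, 2)) for slot in range(len(page.rows))]
```

In this sketch the page is stabilized once for each row retrieved, which is the repeated cost that the passage above identifies.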

In the prior art, techniques are known to improve the efficiency of query execution by reducing the
repeated location and stabilization of pages containing data. Efficiencies may be realized where a
page is located and then remains stabilized for successive read operations. These approaches include
search argument processing in which an evaluation of a retrieved row relative to a query predicate
is performed prior to the release of the stabilization of the page containing the row. Where the row
being retrieved does not meet the query predicate condition, the next row is retrieved from the page.
This method permits multiple rows to be read from the page without releasing and reacquiring the
stabilization on the page. However, where rows are located in the page which do satisfy the query
predicate, the data values are copied for processing by the query processor and the stabilization on
the page is released. Thus where the query predicate is satisfied, there is no increased efficiency
resulting from the adoption of the above approach.
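
A minimal sketch of search argument processing as characterized above may help show where its benefit stops. The `Page` class and the `next_qualifying_row` and `scan` functions are hypothetical names introduced for illustration; the predicate is an ordinary Python callable, and stabilization is again modelled as a counter.

```python
class Page:
    def __init__(self, rows):
        self.rows = rows
        self.stabilize_count = 0

    def stabilize(self):
        self.stabilize_count += 1

    def release(self):
        pass


def next_qualifying_row(page, start_slot, predicate):
    """Return (slot, row) for the next row satisfying the predicate, or None."""
    page.stabilize()
    try:
        # Rows that fail the predicate are skipped while the page stays stabilized.
        for slot in range(start_slot, len(page.rows)):
            row = page.rows[slot]
            if predicate(row):
                return slot, row     # a qualifying row ends the stabilization
        return None
    finally:
        page.release()


def scan(page, predicate):
    """Collect every qualifying row, calling back into the page once per row."""
    out, slot = [], 0
    while True:
        hit = next_qualifying_row(page, slot, predicate)
        if hit is None:
            return out
        found_slot, row = hit
        out.append(row)
        slot = found_slot + 1


rows = [(1,), (2,), (3,), (4,)]
selective = Page(rows)
scan(selective, lambda r: r[0] == 4)     # one qualifying row out of four
every_row = Page(rows)
scan(every_row, lambda r: True)          # every row qualifies
```

With the selective predicate the page is stabilized twice (once for the qualifying row and once for the final call that finds nothing), whereas when every row qualifies it is stabilized five times for four rows, so nothing is saved in that case.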

It is therefore desirable to have a query execution technique which provides increased efficiency of 
the query execution where query predicates are satisfied by retrieved rows. 

SUMMARY OF THE INVENTION

According to one aspect of the present invention, there is provided query execution in a query
processing system having improved efficiency.

According to another aspect of the present invention, there is provided a method for processing a 
database query in a database management system comprising a data manager, a set of data, a query 
processor, and a buffer, the method comprising the steps of: 

the query processor calling the data manager to request the return of data from the set of data,

the data manager accessing the set of data to locate query-specified data and determining if
the query-specified data is to be ignored, consumed, or returned to the query processor,

where the data manager determines that the query-specified data is to be returned to the
query processor, the data manager writing the query-specified data to the buffer, the data
manager repeating access to the set of data until all data requested by the query processor has
been accessed, and

the query processor retrieving query-specified data from the buffer.

According to another aspect of the present invention, there is provided the above method, in which 
the set of data is stored on pages and the method further comprises the step of the data manager 
stabilizing the page on which the query-specified data is located prior to accessing said data, the 
method further comprising the step of maintaining the stabilization of the page while writing to the 
buffer and until all data requested by the query processor on the page has been accessed.

According to another aspect of the present invention, there is provided a program storage device 
readable by a machine, tangibly embodying a program of instructions executable by the machine to
perform the above method steps. 

According to another aspect of the present invention, there is provided a computer program product 
for a database management system comprising a data manager, a query processor and a buffer for 
processing a database query on a set of data, the computer program product comprising a computer
usable medium having computer readable code means embodied in said medium, comprising 

computer readable program code means for the query processor to call the data manager to 
request the return of data from the set of data, 

computer readable program code means for the data manager to access the set of data to 
locate query-specified data and to determine if the query-specified data is to be ignored, 
consumed, or returned to the query processor, 

computer readable program code means for, where the data manager determines that the 
query-specified data is to be returned to the query processor, the data manager to write the 
query-specified data to the buffer, and for the data manager to repeat access to the set of data 
until all data requested by the query processor has been accessed, and 

computer readable program code means for the query processor to retrieve query-specified 
data from the buffer. 

According to another aspect of the present invention, there is provided the above computer program
product, in which the set of data is stored on pages and in which the computer usable medium having
computer readable code means embodied in said medium, further comprises computer readable
program code means for the data manager to stabilize the page on which the query-specified data
is located prior to accessing said data, and for the data manager to maintain the stabilization of the
page while writing to the buffer and until all data requested by the query processor on the page has
been accessed.

According to another aspect of the present invention, there is provided a query processing system 
comprising a data manager, a query processor and a buffer for processing a database query on a set 
of data,

the query processor comprising means for calling the data manager to request the return of
data from the set of data,

the data manager comprising means for accessing the set of data to locate query-specified
data and for determining if the query-specified data is to be ignored, consumed, or returned
to the query processor,

the data manager comprising means for, where determining that the query-specified data is
to be returned to the query processor, writing the query-specified data to the buffer, the data
manager repeating access to the set of data until all data requested by the query processor has
been accessed, and

the query processor comprising means for retrieving query-specified data from the buffer.

According to another aspect of the invention there is provided the above query processing system,
in which the set of data is stored on pages and the data manager further comprises means for 
stabilizing the page on which the query-specified data is located prior to accessing said data, the data 
manager further comprising means for maintaining the stabilization of the page while writing to the 
buffer and until all data requested by the query processor on the page has been accessed. 

Advantages of the present invention include a query execution technique requiring reduced page
stabilization and improved processor utilization. 
BRIEF DESCRIPTION OF THE DRAWING

The preferred embodiment of the invention is shown in the drawing, wherein: 

Figure 1 is a flow chart illustrating the steps in query interpretation using the preferred 
embodiment of the invention. 

In the drawing, the preferred embodiment of the invention is illustrated by way of example. It is to 
be expressly understood that the description and drawing are only for the purpose of illustration and 
as an aid to understanding, and are not intended as a definition of the limits of the invention. 



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

Figure 1 is a flow chart diagram illustrating steps in interpreting a query in accordance with the
preferred embodiment of the invention. Start 10 represents queries, such as SQL statements, that
are processed to access data in a database. In the example architecture of the preferred embodiment,
queries are first processed by compiler 12. Compiler 12 provides query processor 14 with an access
plan based on the nature of the SQL query being compiled. As required, query processor 14 calls
data manager (data management system or DMS) 16 to obtain access to data table 18 (the step of
accessing data table 18 is sometimes termed a data scan). In the preferred embodiment records or
rows of data are stored on pages in data table 18. In the preferred embodiment, rows of data table
18 are available for copying to buffer 20.

In query processing systems that support concurrent access to data, the location and stabilization of
a page containing data is a potentially expensive operation. Each time that data management system
16 stabilizes a page in data 18, and locates (using a notional cursor in the preferred embodiment) a
row or record position in the page in data 18, there will be a resulting time cost introduced in the
processing of the query. The prior art includes techniques to allow multiple rows to be read from
the same page in data 18 without releasing the stabilization or losing the location in the page. These
techniques are typically applied, however, where column data for a row in the table of data 18 is not
returned to query processor 14. Therefore, where a series of rows are read from the table of data 18
and column values for each row are, in fact, returned to query processor 14, there is a repeated time
cost involved in data management system 16 repeatedly stabilizing the page and locating the relevant
rows in the page.

To improve query execution efficiency where row data is returned to query processor 14, the 
preferred embodiment has buffer 20 that is available for use where rows are read from data 18 and 
are to be returned to query processor 14. According to the preferred embodiment, buffer 20 will be 
used where such techniques as search argument processing do not prevent row data being returned 
to query processor 14. Whereas in the prior art, the stabilization of the page containing the row in
data 18 is released when a row is returned, in the preferred embodiment the retrieved column data 
for the row is copied from data table 18 to buffer 20, rather than being returned directly to query 
processor 14. The page stabilization is not automatically released. 
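
The difference from the prior art can be sketched as follows, under simplifying assumptions: the buffer is an unbounded Python list rather than a fixed-size buffer 20, and `Page`, `fill_buffer` and the column indexes are hypothetical names chosen for illustration rather than parts of the preferred embodiment.

```python
class Page:
    def __init__(self, rows):
        self.rows = rows
        self.stabilize_count = 0

    def stabilize(self):
        self.stabilize_count += 1

    def release(self):
        pass


def fill_buffer(page, predicate, wanted_cols):
    """Copy the wanted columns of every qualifying row while stabilized once."""
    buffer = []
    page.stabilize()
    try:
        for row in page.rows:
            if predicate(row):
                # The row is written to the buffer instead of being returned,
                # so the stabilization is not released between rows.
                buffer.append(tuple(row[c] for c in wanted_cols))
    finally:
        page.release()
    return buffer


page = Page([(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)])
buffered = fill_buffer(page, lambda r: r[2] >= 20.0, wanted_cols=(0, 2))

# The query processor consumes the rows from the buffer after the release.
total = sum(values[1] for values in buffered)
```

Here two qualifying rows reach the query processor at the cost of a single stabilization of the page, in exchange for one additional copy of each row.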

As will be apparent, use of buffer 20 incorporates the additional steps of copying data values in the
query execution process. In certain circumstances, however, despite the added cost associated with
the copying of data values to the buffer, efficiencies are realized by a reduction in page stabilization
steps required. In addition, the code which is executed to implement the steps of copying data values
to a buffer is anticipated to be significantly smaller than the code required to copy data values for
use by query processor 14. As a result, instruction and data references are localized, which results
in improved processor utilization and therefore additional efficiencies in the query execution itself.
For these reasons the use of buffer 20 potentially improves performance for a query execution.

Use of buffer 20 may not be advantageous in all circumstances. Factors which influence the use of 
such a technique include the extent of resource usage and time cost inherent in the copying steps and 
the amount of data estimated to be returned from the data scanning access method. Where data 
found on each page is limited, the advantages of buffer 20 are reduced. 
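
The trade-off can be made concrete with a deliberately crude cost model. The model below is an assumption introduced for illustration and is not taken from the preferred embodiment: it charges a fixed `stabilize_cost` for each locate-and-stabilize step and a fixed `buffer_copy_cost` for each extra copy into the buffer, in arbitrary units, and it ignores every other cost.

```python
def unbuffered_cost(rows_on_page, stabilize_cost):
    """Without the buffer: one locate-and-stabilize step per returned row."""
    return rows_on_page * stabilize_cost


def buffered_cost(rows_on_page, stabilize_cost, buffer_copy_cost):
    """With the buffer: one stabilization per page plus one extra copy per row."""
    return stabilize_cost + rows_on_page * buffer_copy_cost


def buffering_pays_off(rows_on_page, stabilize_cost, buffer_copy_cost):
    return (buffered_cost(rows_on_page, stabilize_cost, buffer_copy_cost)
            < unbuffered_cost(rows_on_page, stabilize_cost))


# Hypothetical unit costs, chosen only to show the shape of the trade-off.
sparse = buffering_pays_off(rows_on_page=1, stabilize_cost=10, buffer_copy_cost=1)
dense = buffering_pays_off(rows_on_page=50, stabilize_cost=10, buffer_copy_cost=1)
```

Under this model a page yielding a single row can never benefit, because the extra copy is pure overhead, while a page yielding many rows benefits whenever the copy is cheaper than a stabilization.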

In the preferred embodiment, query processor 14 is designed to ensure that data is retrieved from
buffer 20 in an appropriate manner. As will be apparent, data may be copied to buffer 20 in response
to a query processor 14 call to data management system 16 in a manner which results in buffer 20
being partially filled, or in buffer 20 being completely filled. In the implementation of the preferred
embodiment, where buffer 20 is completely filled, it is also possible for an "extra" set of data values
(an extra row from the table) to be retrieved by data manager 16.

Query processor 14 is implemented to process both the "extra" row and the rows in buffer 20. 
Where the access plan created by compiler 12 requires that the processing of the data maintain the
sequential order of the rows, the "extra" row must be processed after the rows in buffer 20. To 
accommodate this, the preferred embodiment saves the "extra" row to a single row buffer (not 
shown) to permit these values to be copied to the locations in memory for subsequent query 
evaluation operations. 

Where the data values returned by way of buffer 20 are of a size to precisely fit buffer 20, there will
be no "extra" row returned by the DMS. A flag must be set to permit the query processor 14 to 
recognize that rows exist in buffer 20 and to process these rows. 

The following pseudo code description sets out an implementation of the copying of a row from a
page in data table 18 to buffer 20.

DataCopySarg(RelevantCols /* refs to col vals of current row */)
    if (BufferEmpty == TRUE)
        BufferUsed = 0;
        BufferConsumed = 0;
        BufferEmpty = FALSE;
    RowSize = sum(sizeof(RelevantCols));
    if (RowSize <= BufferSize - BufferUsed)
        Copy from RelevantCols to Buffer[BufferUsed];
        BufferUsed = BufferUsed + RowSize;
        return(CONSUMED);
    else
        return(DONE);  // buffer full. Access method must copy
                       // column values & return to query eval

QueryEvalRowScanner()
    if (BufferEmpty == FALSE)
        if (BufferUsed > BufferConsumed)
            Copy from Buffer[BufferConsumed] to HomeBuffers;
            RowSize = sum(sizeof(copied columns));
            BufferConsumed = BufferConsumed + RowSize;
        else
            Copy from ExtraRow to HomeBuffers;
            BufferEmpty = TRUE;
        return(ROW_AVAILABLE);
    else if (SeenEOF == TRUE)
        return(EOF);
    else // not non-empty buffer or EOF
        AccMethResult = accessMethod(...);
        if (AccMethResult == EOF)
            SeenEOF = TRUE;
            if (BufferEmpty == FALSE)
                Copy from Buffer[BufferConsumed] to HomeBuffers;
                RowSize = sum(sizeof(copied columns));
                BufferConsumed = BufferConsumed + RowSize;
                return(ROW_AVAILABLE);
            else
                return(EOF);
        else // not EOF from access method
            Copy from HomeBuffers to ExtraRow;
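
As an illustration, the two routines in the pseudo code can be rendered as a small runnable Python sketch. Several simplifications are assumptions rather than features of the preferred embodiment: rows are tuples of `bytes` values, the buffer holds whole rows while `buffer_used` tracks their size in bytes, `next_row` is a row index standing in for the byte offset BufferConsumed, and `extra_row` is set to `None` to signal that no "extra" row is pending, which covers the case of rows exactly filling the buffer. The names `Scanner`, `row_size`, `access_method`, `access_calls` and `drain` are hypothetical, and page stabilization is not modelled.

```python
CONSUMED, DONE = "CONSUMED", "DONE"
ROW_AVAILABLE, EOF = "ROW_AVAILABLE", "EOF"


def row_size(cols):
    """Hypothetical size measure: total length in bytes of the column values."""
    return sum(len(c) for c in cols)


class Scanner:
    def __init__(self, rows, buffer_size):
        self.source = iter(rows)        # stands in for the scanning access method
        self.buffer_size = buffer_size  # BufferSize, in bytes
        self.buffer = []                # Buffer (whole rows; sizes tracked in bytes)
        self.buffer_used = 0            # BufferUsed
        self.next_row = 0               # row-index analogue of BufferConsumed
        self.buffer_empty = True        # BufferEmpty
        self.seen_eof = False           # SeenEOF
        self.extra_row = None           # ExtraRow (None means no row pending)
        self.home = None                # HomeBuffers
        self.access_calls = 0           # calls made to the access method

    def data_copy_sarg(self, cols):
        """Copy one qualifying row into the buffer if it fits."""
        if self.buffer_empty:
            self.buffer, self.buffer_used, self.next_row = [], 0, 0
            self.buffer_empty = False
        size = row_size(cols)
        if size <= self.buffer_size - self.buffer_used:
            self.buffer.append(cols)
            self.buffer_used += size
            return CONSUMED
        return DONE                     # buffer full

    def access_method(self):
        """Scan until a row does not fit (left in home) or no rows remain."""
        self.access_calls += 1
        for cols in self.source:
            if self.data_copy_sarg(cols) == DONE:
                self.home = cols
                return ROW_AVAILABLE
        return EOF

    def next(self):
        """Place the next row, in scan order, in self.home."""
        if self.buffer_empty:
            if self.seen_eof:
                return EOF
            if self.access_method() == EOF:
                self.seen_eof = True
                if self.buffer_empty:        # nothing was buffered
                    return EOF
            else:
                self.extra_row = self.home   # save the row that did not fit
        if self.next_row < len(self.buffer):
            self.home = self.buffer[self.next_row]
            self.next_row += 1
            if self.next_row == len(self.buffer) and self.extra_row is None:
                self.buffer_empty = True     # drained, no extra row pending
        else:
            self.home, self.extra_row = self.extra_row, None
            self.buffer_empty = True
        return ROW_AVAILABLE


def drain(scanner):
    """Collect every row the scanner yields, in the order it yields them."""
    out = []
    while scanner.next() == ROW_AVAILABLE:
        out.append(scanner.home)
    return out


rows = [(b"aa", b"b"), (b"cc", b"d"), (b"ee", b"f"), (b"gg", b"h"), (b"ii", b"j")]
scanner = Scanner(rows, buffer_size=7)   # two 3-byte rows fit; the third is "extra"
ordered = drain(scanner)
```

In this sketch the buffered rows are handed over before the saved "extra" row, so the scan order is preserved, and the five rows are obtained with two calls to the access method rather than one call per row.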



The following table sets out the characteristics of the data objects referred to in the above
pseudo-code:

NAME              DESCRIPTION

Buffer            The multi-row buffer. The size of the multi-row buffer is
                  optimised based on the number of rows expected, the sizes
                  of the column data values, and the cost of dedicating
                  memory to the multi-row buffer.

ExtraRow          Buffer for the "extra" row returned by the data scanning
                  access method.

BufferSize        The size, in bytes, of the multi-row buffer.

BufferUsed        The amount, in bytes, of the multi-row buffer currently
                  containing data column values from zero or more rows.

BufferConsumed    The amount, in bytes, consumed from the multi-row
                  buffer. Should be less than or equal to BufferUsed.

BufferEmpty       A flag indicating that the multi-row buffer is empty.

SeenEOF           A flag indicating that the scanning access method has
                  reported that there are no more rows.

HomeBuffers       The locations in memory designated to receive the
                  relevant data column values of a scanning access method.

RelevantCols      The locations in the current page of the scanning access
                  method of the columns needed by the copying SARG
                  operator.

The following table sets out details of certain procedures referred to in the pseudo-code
example set out above:

Create home buffers         The home locations for the relevant column values
                            of rows identified by a data scanning access method
                            must be determined and communicated to both the
                            data scanning access method and the query
                            processor.

Prepare HomeBuffers list    The list of home buffers must be prepared for use by
                            the copying SARG operation mechanism.

Create Buffer and buffer    The multi-row buffer must be created and the
control information         BufferEmpty and SeenEOF flags must be set to true
                            and false respectively.

Prepare SARG processor      The information used to control the SARG processor
directives                  must be prepared to include the indication that the
                            copying SARG operator is to be invoked if all the
                            predicates are TRUE.



As the above example implementation indicates, the code for use of buffer 20 is potentially limited 
and may run in a tight loop with both code and data kept in cache close to a computer processor. For 
this reason, the use of buffer 20 for certain queries will improve performance of the processing of 
the query. 

Although a preferred embodiment of the present invention has been described here in detail, it will
be appreciated by those skilled in the art, that variations may be made thereto. Such variations may 
be made without departing from the spirit of the invention or the scope of the appended claims. 
The embodiments of the invention in which an exclusive property or privilege is claimed are
defined as follows: 

1. A method for processing a database query on a set of data in a database management system
comprising a data manager, a query processor, and a buffer, the method comprising the steps
of:

a. the query processor calling the data manager to request the return of data from the set
of data,

b. the data manager accessing the set of data to locate query-specified data and
determining if the query-specified data is to be ignored, consumed, or returned to the
query processor,

c. where the data manager determines that the query-specified data is to be returned to
the query processor, the data manager writing the query-specified data to the buffer,
the data manager repeating access to the set of data until all data requested by the
query processor has been accessed, and

d. the query processor retrieving query-specified data from the buffer.

2. The method of claim 1, in which the set of data is stored on pages and the method further
comprises the step of the data manager stabilizing the page on which the query-specified data
is located prior to accessing said data, the method further comprising the step of maintaining
the stabilization of the page while writing to the buffer and until all data requested by the
query processor on the page has been accessed.

3. A program storage device readable by a machine, tangibly embodying a program of 
instructions executable by the machine to perform method steps for processing queries for 
a database, said method steps comprising the method steps of claim 1 or 2. 

4. A computer program product for a database management system comprising a data manager,
a query processor and a buffer for processing a database query on a set of data, the computer
program product comprising a computer usable medium having computer readable code
means embodied in said medium, comprising



computer readable program code means for the query processor to call the data manager
to request the return of data from the set of data,

computer readable program code means for the data manager to access the set of data to
locate query-specified data and to determine if the query-specified data is to be ignored,
consumed, or returned to the query processor,

computer readable program code means for, where the data manager determines that the
query-specified data is to be returned to the query processor, the data manager to write
the query-specified data to the buffer, and for the data manager to repeat access to the set
of data until all data requested by the query processor has been accessed, and

computer readable program code means for the query processor to retrieve query-
specified data from the buffer.

5. The computer program product of claim 4, in which the set of data is stored on pages and in
which the computer usable medium having computer readable code means embodied in said
medium, further comprises computer readable program code means for the data manager to
stabilize the page on which the query-specified data is located prior to accessing said data,
and for the data manager to maintain the stabilization of the page while writing to the buffer
and until all data requested by the query processor on the page has been accessed.

6. A query processing system comprising a data manager, a query processor and a buffer for 
processing a database query on a set of data, 

the query processor comprising means for calling the data manager to request the return 
of data from the set of data, 
the data manager comprising means for accessing the set of data to locate query-specified
data and for determining if the query-specified data is to be ignored, consumed, or
returned to the query processor,

the data manager comprising means for, where determining that the query-specified data
is to be returned to the query processor, writing the query-specified data to the buffer, the
data manager repeating access to the set of data until all data requested by the query
processor has been accessed, and

the query processor comprising means for retrieving query-specified data from the
buffer.

7. The query processing system of claim 6, in which the set of data is stored on pages and the
data manager further comprises means for stabilizing the page on which the query-specified
data is located prior to accessing said data, the data manager further comprising means for
maintaining the stabilization of the page while writing to the buffer and until all data
requested by the query processor on the page has been accessed.

IMPROVED QUERY EXECUTION IN QUERY PROCESSING SYSTEMS



ABSTRACT 



A query processing system having a data manager and a query manager also includes a buffer. The
query manager calls the data manager to access data based on a query. Where there is no predicate
check or consumption operation on the record accessed, the data manager will notionally return the
record to the query manager. However, the data manager accomplishes the return by writing the
relevant portions of the record to a buffer. The data manager maintains stabilization of the page
containing the record while the buffer is being written to. The data manager continues to access
records on the stabilized page and to write such records to the buffer where appropriate. The query
manager retrieves the records from the buffer after the data manager has completed its operation
resulting from the query manager call.